Statistical Consulting
Springer-Verlag (2002), 393 pp.


Welcome! This page provides links to the datasets discussed in the case studies of the book Statistical Consulting.
Note that the name of each link corresponds to the subsection where the dataset was used. See the list below the table for a brief description of each case study and access to unzipped versions of the data files.
Additional information on the case studies can be found at: http://www.rci.rutgers.edu/~cabrera
To order this book, go to: Springer-Verlag Online Catalog

Chapter 6

 c61   c62   c63   c64 

Chapter 7

 c71   c72   c73   c74 

Chapter 8

 c81   c82   c83   c84 

Chapter 9

 c91   c93   c96   c97   c98 

Other

 Ch4   Course Outline  ESL


Questions? Send email to: mcdougal@pegasus.montclair.edu


Chapter 6

Group I. Simple case studies where the answer is well known, but with some interesting statistical and scientific issues to be discussed.

1.      c61  Job Promotion Discrimination.
Contingency tables.
Two contingency tables related to a discrimination case are given.

2.      c62.zip  The Case of the Lost Mail.
Sample survey analysis.
Two datasets: c62.survey.dat contains the results of a survey (17k); c62.surinfo.dat contains information about the sample used (10k). Use c62.zip to download both files directly.

3.      c63  A Device to Reduce Engine Emissions.
Analysis of variance.
The results from an investigation of a claim made by a manufacturer that its device would reduce car engine emissions.

4.      c64  Reverse Psychology.
Summary statistics.
The results of 72 physicians who attended an instructional workshop on the PANSS instrument (Positive and Negative Syndrome Scale), which can be used to assess a patient's psychological state. Their results may be compared to those of the expert (KEY) rater.
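
For a 2x2 contingency table such as those in case study c61, one common starting point is the Pearson chi-square test of independence. The sketch below is only an illustration under stated assumptions: the function names are made up for this example, no continuity correction is applied, and the counts are hypothetical placeholders, not values from the c61 file.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical counts for illustration only (NOT taken from c61):
# rows = two groups of employees, columns = promoted / not promoted.
stat = chi_square_2x2(20, 10, 10, 20)
print(f"chi-square = {stat:.3f}, p-value = {p_value_df1(stat):.4f}")
```

With small expected cell counts, an exact test may be more appropriate than this large-sample approximation.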

Chapter 7

Group II. More complicated case studies where the statistical problem is generally well defined, but broader in scope than the Group I case studies. Several solutions may need to be evaluated.

1.      c71  The Flick Tail Study.
Logistic Regression.
The data consist of a response variable which gives the proportion of mice that flicked their tails after being administered a heat stimulus for 20 different combinations of two drug dosages.

2.      c72  Does It Have Good Taste?
Factorial Designs.
The data contain the results from a central composite experiment involving five control variables (factors) and four response variables (outputs). The objective is to determine which factor conditions provide the optimum response for these outputs.

3.      c73.zip  Expenditures in NY Municipalities.
Regression Modeling.
The data c73.dat (57k) consist of a common set of demographic and income-related predictors that can be used to develop a suitable regression model for predicting per capita expenditures in three New York State towns.

4.      c74  Measuring Quality Time.
Time Series Analysis.
Four years of monthly data of an overall quality score for a certain product are given. The objective is to obtain a good time series model for modeling this score.

Chapter 8

Group III. Research-oriented case studies where the statistical problem is not straightforward and the students have to do a lot of thinking. Several stages of analysis are typically needed to obtain suitable results. There may not necessarily be an ''answer'' to the statistical problem.

1.      c81  A Tale of Two Thieves.
Mixed Effects Models.
Making sure tablets contain the correct dosage is an important problem in the drug manufacturing industry. The data contains the results from an experiment conducted by a pharmaceutical company to investigate sampling variability and bias associated with the manufacture of a certain type of tablet.

2.      c82.zip  Plastic Explosives Detection.
Discriminant Analysis.
The file c82.dat (671k) contains 2500 profiles from an early X-ray machine prototype. Half the profiles correspond to plastic explosive substances and the remainder to substances typically found in suitcases. Several types of classification methods can be employed to develop a rule to discriminate between the two profiles. How reliable is your rule?

3.      c83  A Market Research Study.
Factor Analysis.
The objective of the study is to identify consumer segments whose catalogue purchasing profiles are similar to those of the viewers of popular TV shows.

4.      c84.zip  Sales of Orthopedic Equipment.
Data Mining Applications to Market Research.
The objective of this study is to identify U.S. hospitals that currently buy our client's orthopedic equipment at lower levels than would be expected. Data is contained in the file: c84.dat (601k).
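
Case study c82 asks how reliable a classification rule is. One way to approach that question is to estimate the rule's error rate by k-fold cross-validation. The sketch below is a minimal illustration, not the book's method: it uses a nearest-centroid classifier as a simple stand-in for a discriminant rule, the function names are hypothetical, and the two-class data are synthetic (randomly generated), not read from c82.dat.

```python
import random

def fit_centroids(rows, labels):
    """Return {label: mean vector} computed from the training rows."""
    sums, counts = {}, {}
    for x, y in zip(rows, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sqdist(centroids[y]))

def cv_error_rate(rows, labels, k=5, seed=0):
    """Estimate the misclassification rate by k-fold cross-validation."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        centroids = fit_centroids([rows[i] for i in train],
                                  [labels[i] for i in train])
        errors += sum(predict(centroids, rows[i]) != labels[i] for i in fold)
    return errors / len(rows)

# Synthetic two-class data for illustration only (NOT the c82 profiles).
rng = random.Random(1)
rows = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)] +
        [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(100)])
labels = [0] * 100 + [1] * 100
print("cross-validated error rate:", cv_error_rate(rows, labels))
```

The key design point is that each profile is classified by a rule fitted without it, so the estimate is not flattered by reusing the training data.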

Chapter 9

Case study exercises ... See you next week for the results?

1.      c91  Improving Teaching.
A study to investigate whether incorporating learning style preferences of 59 elementary students improved their understanding of three science courses.

2.      c93.zip  Left or Right?
The data consists of various measurements taken from flakes that were used as cutting tools by early hominids in the Pleistocene period (2 million years BC!). Previous studies have suggested that certain flake patterns were consistent with knapper handedness and hence that early hominids may have possessed cognitive reasoning abilities. This conclusion is subject to debate, of course, but leads to the interesting question of whether hand preference is discernible from this archeological record. Data is contained in the file: c93.dat (601k).

3.      c96  Bentley's Revenge.
A follow-up study to Case Study c63.

4.      c97  Wear what you like?
The focus of this study was whether the interaction between student and teacher differed according to the type of clothing worn by a student.

5.      c98  An AIDS Study.
Measuring the cell count of certain cells provides an effective means of monitoring patients who are affected by the AIDS virus. The purpose of this study was to see if the cell counts of three particular measures provided useful discrimination between two groups of couples infected with the AIDS virus.

Other

·        Chapter 4 data. 
The dataset used in A Project from A to Z.

·        Course Outline. 
An outline for a graduate course on statistical consulting.
The aim of this course is to expose students to realistic problems that appear in typical interactions between statisticians and scientists. The lectures are centered around case studies presented by invited speakers.

·        ESL. 
English as a Second Language (ESL).
Additional notes that were not included in the book. Some very basic information on common trouble spots for ESL students writing reports.